Arabic Text Summerization Model Using Clustering Techniques
نویسندگان
چکیده
the current work investigates a developed automatic Arabic text summarization model. In this model, a technique of word root clustering is used as the major activity. Unlike the previously presented systems of Arabic text summarization in the extract based design field, the current model adopts cluster weight of word roots instead of the word weight itself. The model is thoroughly illustrated through its different stages. Obviously, the general scheme follows traditional descriptive model of most of the system stages in literature with the exception of the ranking stage. This model with its developed technique has been subjected to a set of experiments. Various Arabic text examples are used for evaluation purposes. The efficiency of the summarization is calculated in terms of Precision and Recall measures. Result obtained actually is considered promising and competitive to the verb/noun categorization ranking method. This enhancement has been detected for Precision 76% and Recall 79% with the analogous values of 62% and 70% obtained in the verb/noun categorization method. The enhancement emerges in this tangible result is attributed to the implicit embedding of semantic capability of the developed model to expand the extract boundaries towards the abstract extremes of the design theme.
منابع مشابه
Using Machine Learning Algorithms for Automatic Cyber Bullying Detection in Arabic Social Media
Social media allows people interact to express their thoughts or feelings about different subjects. However, some of users may write offensive twits to other via social media which known as cyber bullying. Successful prevention depends on automatically detecting malicious messages. Automatic detection of bullying in the text of social media by analyzing the text "twits" via one of the machine l...
متن کاملA Joint Semantic Vector Representation Model for Text Clustering and Classification
Text clustering and classification are two main tasks of text mining. Feature selection plays the key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researches to use...
متن کاملPre Processing Techniques for Arabic Documents Clustering
Clustering of text documents is an important technique for documents retrieval. It aims to organize documents into meaningful groups or clusters. Preprocessing text plays a main role in enhancing clustering process of Arabic documents. This research examines and compares text preprocessing techniques in Arabic document clustering. It also studies effectiveness of text preprocessing techniques: ...
متن کاملArabic News Articles Classification Using Vectorized-Cosine Based on Seed Documents
Besides for its own merits, text classification (TC) has become a cornerstone in many applications. Work presented here is part of and a pre-requisite for a project we have overtaken to create a corpus for the Arabic text process. It is an attempt to create modules automatically that would help speed up the process of classification for any text categorization task. It also serves as a tool for...
متن کاملOff-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model
In order to facilitate the entry of data into the computer and its digitalization, automatic recognition of printed texts and manuscripts is one of the considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...
متن کامل